Cypress: Managing Massive Time Series Streams with Multi-Scale Compressed Trickles

Authors

  • Galen Reeves
  • Jie Liu
  • Suman Nath
  • Feng Zhao
Abstract

We present Cypress, a novel framework to archive and query massive time series streams such as those generated by sensor networks, data centers, and scientific computing. Cypress applies multi-scale analysis to decompose time series and to obtain sparse representations in various domains (e.g., the frequency domain and the time domain). Because these representations are sparse, the time series streams can be archived in less storage space. We then show that many statistical queries, such as trends, histograms, and correlations, can be answered directly from the compressed data rather than from reconstructed raw data. Our evaluation with server utilization data collected from real data centers shows a significant benefit from our framework.
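
The general idea of decomposing a series at multiple scales, keeping only a sparse set of coefficients, and answering queries from those coefficients can be illustrated with a small sketch. The code below is not the Cypress algorithm; it assumes an orthonormal Haar wavelet transform as a stand-in for the multi-scale analysis, and all function names (`haar_decompose`, `sparsify`, `mean_from_compressed`, `dot_from_compressed`) are hypothetical.

```python
import math


def haar_decompose(signal):
    """Orthonormal Haar transform (an assumed stand-in, not Cypress itself).

    len(signal) must be a power of two. Returns the coefficients as
    [overall approximation, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(x) for x in signal]
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # coarser levels go in front
    return approx + details


def sparsify(coeffs, k):
    """Keep the k largest-magnitude coefficients as {index: value}."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: coeffs[i] for i in order[:k]}


def mean_from_compressed(sparse, n):
    """Mean of the original series, read off coefficient 0 without reconstruction.

    For the orthonormal Haar transform, coefficient 0 equals sum(signal) / sqrt(n).
    If coefficient 0 was dropped, this falls back to 0.0.
    """
    return sparse.get(0, 0.0) / math.sqrt(n)


def dot_from_compressed(sparse_a, sparse_b):
    """Inner product of two series, computed in the coefficient domain.

    Exact when no nonzero coefficient was dropped (the transform is
    orthonormal), and an approximation otherwise.
    """
    return sum(v * sparse_b[i] for i, v in sparse_a.items() if i in sparse_b)


signal = [4, 4, 4, 4, 10, 10, 4, 4]
sparse = sparsify(haar_decompose(signal), k=3)
mean = mean_from_compressed(sparse, len(signal))
energy = dot_from_compressed(sparse, sparse)
```

In this toy series only three of the eight coefficients are nonzero, so storing three values loses nothing, and both the mean and the inner product (a building block for correlation) come straight from the stored coefficients. On real data the retained coefficients would only approximate the series, and the choice of transform and of `k` is an assumption of this sketch.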

Similar resources

Massive Data Streams Research: Where to Go

This phenomenon has challenged how we store, communicate, and compute with data. Theories developed over the past 50 years have relied on full capture, storage, and communication of data. Instead, what we need for managing modern massive data streams are new methods built around working with less. The past 10 years have seen new theories emerge in computing (data stream algorithms), communication (co...

On Clustering Graph Streams

In this paper, we will examine the problem of clustering massive graph streams. Graph clustering poses significant challenges because of the complex structures which may be present in the underlying data. The massive size of the underlying graph makes explicit structural enumeration very difficult. Consequently, most techniques for clustering multi-dimensional data are difficult to generalize t...

A Sketch-based Clustering Algorithm for Uncertain Data Streams

Because of inaccuracy and noise, uncertainty is inherent in time series streams and increases the complexity of stream clustering. Because the data arrive continuously and in massive volume, efficient data storage is crucial for clustering uncertain data streams. Using a hash-compressed structure, an extended uncertain sketch and an update strategy are proposed to store uncertain data streams. And ba...

SCAN: Spectral Compressed Analysis for Monitoring Evolving Multi-Relational Social Networks

We propose SCAN, an innovative, spectral analysis framework for internet scale monitoring of multi-relational social media data, encoded in the form of tensor streams. In particular, a significant challenge is to detect key changes in the social media data, which could reflect important events in the real world, sufficiently quickly. Social media data have three challenging characteristics. Fir...

Publication date: 2009